true

This vignette contains a quick walkthrough of the FOV QC tool’s functions on a small demo dataset.

First we’ll load the necessary data and code:


## source the necessary functions:
source("https://raw.githubusercontent.com/Nanostring-Biostats/CosMx-Analysis-Scratch-Space/Main/_code/FOV%20QC/FOV%20QC%20utils.R")

## load barcodes:
allbarcodes <- readRDS(url("https://github.com/Nanostring-Biostats/CosMx-Analysis-Scratch-Space/raw/Main/_code/FOV%20QC/barcodes_by_panel.RDS"))

names(allbarcodes)
#> [1] "Hs_IO"    "Hs_UCC"   "Hs_6k"    "Mm_Neuro" "Mm_UCC"
# get the barcodes for the panel we want:
barcodemap <- allbarcodes$Hs_6k
head(barcodemap)
#>    gene                                                barcode
#> 1  A1BG ......YY..........................GG..........GG..GG..
#> 2   A2M ........GG......YY................GG......BB..........
#> 3  AAAS ........BB........RR....GGYY..........................
#> 4  AAK1 YY..........YY......YY............BB..................
#> 5  AAMP ..........BB..........YY......YY..BB..................
#> 6 AARS1 ......GGYY............BB............BB................

## load example data: a subset of FOVs and genes from a 6k panel study of breast cancer:
load(url("https://github.com/Nanostring-Biostats/CosMx-Analysis-Scratch-Space/raw/Main/_code/FOV%20QC/FOV%20QC%20example%20data.RData"))

# data structure:
str(counts)
#> Loading required package: Matrix
#> Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
#>   ..@ i       : int [1:1886888] 0 1 2 3 4 5 6 8 9 10 ...
#>   ..@ p       : int [1:401] 0 20788 40939 60808 81046 89942 99749 115292 123472 140353 ...
#>   ..@ Dim     : int [1:2] 23000 400
#>   ..@ Dimnames:List of 2
#>   .. ..$ : chr [1:23000] "c_5_17_455" "c_5_2_1656" "c_5_16_630" "c_5_15_1161" ...
#>   .. ..$ : chr [1:400] "NDUFA3" "B2M" "MHC I" "TMSB4X" ...
#>   ..@ x       : num [1:1886888] 14 13 12 7 37 7 1 25 22 10 ...
#>   ..@ factors : list()
counts[1:5, 1:5]
#> 5 x 5 sparse Matrix of class "dgCMatrix"
#>             NDUFA3 B2M MHC I TMSB4X IGHG1/2
#> c_5_17_455      14  13    12     10       3
#> c_5_2_1656      13   5     4      1       .
#> c_5_16_630      12  30    17     10       .
#> c_5_15_1161      7  28    13     21       .
#> c_5_14_1219     37   5    10      7       .
head(xy)
#>             x_slide_mm y_slide_mm
#> c_5_17_455    21.85437   2.444147
#> c_5_2_1656    20.89152   1.668696
#> c_5_16_630    21.41895   2.427428
#> c_5_15_1161   21.37084   1.877383
#> c_5_14_1219   21.53082   1.204412
#> c_5_2_931     20.69655   1.864153
head(fov)
#> [1] 17  2 16 15 14  2
# (Note : the above data objects can easily be extracted from a Giotto or Seurat or tileDB object. 
# See the docs for whichever system you're using for how to extract count matrices and metadata columns.
# Note that this code expects a count matrix with cells in rows & genes in columns, and 
# some toolkits store a transposed version of this matrix. In this case, use Matrix::t() to transpose
# your sparse counts matrix.)

Now we run the FOV QC function. A single line of code suffices. Note that for demonstration purposes we have chosen very cautious thresholds for flagging FOVs; we generally recommend using the default thresholds.

# run the method:
res <- runFOVQC(counts = counts, xy = xy, fov = fov, barcodemap = barcodemap,
                max_prop_loss = 0.3, max_totalcounts_loss = 0.3)  # small values chosen to generate more errors for the demonstration. 
#> The following FOVs failed QC for one or more barcode positions: 16, 19, 18
str(res)
#> List of 11
#>  $ flaggedfovs                  : chr [1:3] "16" "19" "18"
#>  $ flaggedfovs_fortotalcounts   : chr [1:2] "16" "19"
#>  $ flaggedfovs_forbias          : chr [1:3] "16" "19" "18"
#>  $ flagged_fov_x_gene           : chr [1:5571, 1:2] "16" "16" "16" "16" ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : NULL
#>   .. ..$ : chr [1:2] "fov" "gene"
#>  $ flags_per_fov_x_reportercycle: num [1:9, 1:27] 0 0 0 0 0 0 0.25 0 0 0 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:9] "17" "2" "16" "15" ...
#>   .. ..$ : chr [1:27] "reportercycle1" "reportercycle2" "reportercycle3" "reportercycle4" ...
#>  $ fovstats                     :List of 4
#>   ..$ flag     : int [1:9, 1:108] 0 0 0 0 0 0 0 0 0 0 ...
#>   .. ..- attr(*, "dimnames")=List of 2
#>   .. .. ..$ : chr [1:9] "17" "2" "16" "15" ...
#>   .. .. ..$ : chr [1:108] "reportercycle1B" "reportercycle1G" "reportercycle1Y" "reportercycle1R" ...
#>   ..$ bias     : num [1:9, 1:108] -0.1521 -0.0373 0.077 0.0252 0.0901 ...
#>   .. ..- attr(*, "dimnames")=List of 2
#>   .. .. ..$ : chr [1:9] "17" "2" "16" "15" ...
#>   .. .. ..$ : chr [1:108] "reportercycle1B" "reportercycle1G" "reportercycle1Y" "reportercycle1R" ...
#>   ..$ p        : num [1:9, 1:108] 3.87e-08 1.70e-01 4.82e-03 3.54e-01 9.88e-04 ...
#>   .. ..- attr(*, "dimnames")=List of 2
#>   .. .. ..$ : chr [1:9] "17" "2" "16" "15" ...
#>   .. .. ..$ : chr [1:108] "reportercycle1B" "reportercycle1G" "reportercycle1Y" "reportercycle1R" ...
#>   ..$ propagree: num [1:9, 1:108] 0.5306 0.2041 0 0.1429 0.0408 ...
#>   .. ..- attr(*, "dimnames")=List of 2
#>   .. .. ..$ : chr [1:9] "17" "2" "16" "15" ...
#>   .. .. ..$ : chr [1:108] "reportercycle1B" "reportercycle1G" "reportercycle1Y" "reportercycle1R" ...
#>  $ resid                        : num [1:438, 1:108] 0.25069 0.07905 -0.00768 0.25014 0.10722 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:438] "17_(21.81,21.88]_(2.38,2.46]" "2_(20.86,20.93]_(1.65,1.73]" "16_(21.37,21.44]_(2.38,2.46]" "15_(21.37,21.44]_(1.87,1.95]" ...
#>   .. ..$ : chr [1:108] "reportercycle1B" "reportercycle1G" "reportercycle1Y" "reportercycle1R" ...
#>  $ totalcountsresids            : num [1:438(1d)] -0.108 0.557 -0.699 0.273 0.769 ...
#>   ..- attr(*, "dimnames")=List of 1
#>   .. ..$ gridinfo$gridid: chr [1:438] "17_(21.81,21.88]_(2.38,2.46]" "2_(20.86,20.93]_(1.65,1.73]" "16_(21.37,21.44]_(2.38,2.46]" "15_(21.37,21.44]_(1.87,1.95]" ...
#>  $ gridinfo                     :List of 2
#>   ..$ gridid : chr [1:23000] "17_(21.81,21.88]_(2.38,2.46]" "2_(20.86,20.93]_(1.65,1.73]" "16_(21.37,21.44]_(2.38,2.46]" "15_(21.37,21.44]_(1.87,1.95]" ...
#>   ..$ gridfov: Named chr [1:441] "17" "17" "17" "17" ...
#>   .. ..- attr(*, "names")= chr [1:441] "17_(21.81,21.88]_(2.38,2.46]" "17_(21.81,21.88]_(2.31,2.38]" "17_(21.66,21.74]_(2.17,2.24]" "17_(21.88,21.96]_(2.38,2.46]" ...
#>  $ xy                           : num [1:23000, 1:2] 21.9 20.9 21.4 21.4 21.5 ...
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:23000] "c_5_17_455" "c_5_2_1656" "c_5_16_630" "c_5_15_1161" ...
#>   .. ..$ : chr [1:2] "x_slide_mm" "y_slide_mm"
#>  $ fov                          : chr [1:23000] "17" "2" "16" "15" ...

Parts of the output provide a high-level view of the impacted FOVs and genes:

# which FOVs were flagged:
res$flaggedfovs
#> [1] "16" "19" "18"
# show them in space:
mapFlaggedFOVs(res)

# which genes are involved in the flagged bits in those FOVs:
head(res$flagged_fov_x_gene)
#>      fov  gene      
#> [1,] "16" "AAK1"    
#> [2,] "16" "ABCA2"   
#> [3,] "16" "ABCC4"   
#> [4,] "16" "ABCC8"   
#> [5,] "16" "ABCD1"   
#> [6,] "16" "ABRAXAS1"
# show the flagged genes (genes losing just one reporter cycle impacts a lot of genes):
head(res$flagged_fov_x_gene[, "gene"])
#> [1] "AAK1"     "ABCA2"    "ABCC4"    "ABCC8"    "ABCD1"    "ABRAXAS1"
# count how many genes were impacted in one or more flagged FOVs:
length(unique(res$flagged_fov_x_gene[, "gene"]))
#> [1] 1749

Now we’ll go into detail, dissecting the causes of FOV flags. We’ll start by looking at signal loss:

# Look for loss in total signal strength:
FOVSignalLossSpatialPlot(res, shownames = TRUE) 

Two FOVs with clear signal loss have been flagged; another with more marginal signal loss has passed.

Now we’ll look for FOVs with biased expression profiles. Below we draw a heatmap of estimated bias suffered for each FOV * barcode bit (only flagged FOV * bits are colored); Use this view to peek under the hood at the intermediate results used to flag individual FOVs.

FOVEffectsHeatmap(res) 

We see FOV 19 has many flagged bits, though most are from just one color from a reporter cycle. Biological variability can cause single bits to be flagged, so we only flag FOVs in which at least two bits/colors from a single reporter cycle appear anomalous. However, all 4 bits of reporter cycle 12 are flagged in FOV 19, so we fail that FOV. Similarly, we fail FOV 17 for having 3 failed bits in both reporter cycle 18 and 27, and FOV 18 for failing all 4 bits of reporter cycle 18. FOVs 17 and 14 have sporadic bits flagged, but not consistently within a reporter cycle, so we pass them.

Next we draw spatial plots of per-bit FOV effects: These plots show all bits from flagged reporter cycles (only reporter cycles with >= 2 flagged bits in a single FOV get shown): Here we divide the tissue into sub-FOV grids, and we color each grid square by its change in a barcode bit’s total gene expression from similar grid squares from other FOVs. FOVs flagged for the bit are highlighted yellow. Grid squares without enough information are omitted.

par(mfrow = c(2,2))
FOVEffectsSpatialPlots(res = res, outdir = NULL, bits = "flagged_reportercycles") 

In the above plots, FOV failures tend to be unambiguous: it’s very clear when all 4 bits from a reporter cycle are lost in an impacted FOV.

As a contrast, we can plot a flagged bit that appears more driven by spatial biology than by FOV artifacts. Although one FOV was flagged for this bit, no other bits from this reporter cycle have been flagged in that FOV, so this result did not cause any FOVs to fail.

par(mfrow = c(1,1))
FOVEffectsSpatialPlots(res = res, outdir = NULL, bits = "reportercycle22R")